Resynchronizing Jittery AES Power Traces

What happens if things aren't as clean as we made them out to be? We can use preprocessing modules!

In [1]:
SCOPETYPE = 'CWNANO'
PLATFORM = 'CWNANO'
CRYPTO_TARGET = 'TINYAES128C'
num_traces = 250
CHECK_CORR = False

Capturing Jittery Traces

Rebuilding New Firmware

In file chipwhisperer/hardware/victims/firmware/simpleserial-aes/simpleserial-aes.c find this:

uint8_t get_pt(uint8_t* pt)
{
    trigger_high();
    aes_indep_enc(pt); /* encrypting the data block */
    trigger_low();
    simpleserial_put('r', 16, pt);
    return 0x00;
}

and add a delay that varies from trace to trace:

uint8_t get_pt(uint8_t* pt)
{
    trigger_high();
    for (volatile uint8_t k = 0; k < (*pt & 0x0F); k++);
    aes_indep_enc(pt); /* encrypting the data block */
    trigger_low();
    simpleserial_put('r', 16, pt);
    return 0x00;
}

This deterministic delay is NOT a good countermeasure, but is much easier to write in a single line since we don’t have a CSPRNG linked in. We’ll break the jitter without relying on the deterministic aspect though, so our attack would work against a better jitter source.
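To make the "deterministic" point concrete, here is a small Python model of the loop bound used above. The helper name jitter_iterations is hypothetical and only mirrors the C expression (*pt & 0x0F); it is not part of the firmware or of ChipWhisperer.

```python
def jitter_iterations(plaintext: bytes) -> int:
    """Number of empty loop iterations the modified get_pt() would run.

    Mirrors the C expression (*pt & 0x0F): only the low nibble of the
    first plaintext byte matters, so the delay is between 0 and 15
    iterations and is fully predictable from the plaintext.
    """
    return plaintext[0] & 0x0F


# A few example plaintexts (illustrative values only)
for pt in (bytes([0x00] * 16), bytes([0x3A] + [0x00] * 15), bytes([0xFF] * 16)):
    print("first byte 0x{:02X} -> {} iterations".format(pt[0], jitter_iterations(pt)))
```

An attacker who knows the plaintext could undo this delay directly, which is why it is a poor countermeasure. The resynchronization used later does not use that knowledge.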

Be sure to remove this delay loop afterwards so you don't break later tutorials that use the same firmware!

We can now build the firmware (change the platform as needed) and check that the build output below looks as expected:

In [2]:
%%bash -s "$PLATFORM" "$CRYPTO_TARGET"
cd ../../hardware/victims/firmware/simpleserial-aes
make PLATFORM=$1 CRYPTO_TARGET=$2 EXTRA_OPTS=ADD_JITTER
rm -f -- simpleserial-aes-CWNANO.hex
rm -f -- simpleserial-aes-CWNANO.eep
rm -f -- simpleserial-aes-CWNANO.cof
rm -f -- simpleserial-aes-CWNANO.elf
rm -f -- simpleserial-aes-CWNANO.map
rm -f -- simpleserial-aes-CWNANO.sym
rm -f -- simpleserial-aes-CWNANO.lss
rm -f -- objdir/*.o
rm -f -- objdir/*.lst
rm -f -- simpleserial-aes.s simpleserial.s stm32f0_hal_nano.s stm32f0_hal_lowlevel.s aes.s aes-independant.s
rm -f -- simpleserial-aes.d simpleserial.d stm32f0_hal_nano.d stm32f0_hal_lowlevel.d aes.d aes-independant.d
rm -f -- simpleserial-aes.i simpleserial.i stm32f0_hal_nano.i stm32f0_hal_lowlevel.i aes.i aes-independant.i
.
-------- begin --------
arm-none-eabi-gcc (GNU Tools for Arm Embedded Processors 7-2018-q2-update) 7.3.1 20180622 (release) [ARM/embedded-7-branch revision 261907]
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

.

Compiling C: simpleserial-aes.c

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/simpleserial-aes.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/simpleserial-aes.o.d simpleserial-aes.c -o objdir/simpleserial-aes.o 

.

Compiling C: .././simpleserial/simpleserial.c

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/simpleserial.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/simpleserial.o.d .././simpleserial/simpleserial.c -o objdir/simpleserial.o 

.

Compiling C: .././hal/stm32f0_nano/stm32f0_hal_nano.c

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/stm32f0_hal_nano.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/stm32f0_hal_nano.o.d .././hal/stm32f0_nano/stm32f0_hal_nano.c -o objdir/stm32f0_hal_nano.o 

.

Compiling C: .././hal/stm32f0/stm32f0_hal_lowlevel.c

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/stm32f0_hal_lowlevel.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/stm32f0_hal_lowlevel.o.d .././hal/stm32f0/stm32f0_hal_lowlevel.c -o objdir/stm32f0_hal_lowlevel.o 

.

Compiling C: .././crypto/tiny-AES128-C/aes.c

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/aes.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/aes.o.d .././crypto/tiny-AES128-C/aes.c -o objdir/aes.o 

.

Compiling C: .././crypto/aes-independant.c

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/aes-independant.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/aes-independant.o.d .././crypto/aes-independant.c -o objdir/aes-independant.o 

.

Assembling: .././hal/stm32f0/stm32f0_startup.S

arm-none-eabi-gcc -c -mcpu=cortex-m0 -I. -x assembler-with-cpp -mthumb -mfloat-abi=soft -ffunction-sections -DF_CPU=7372800 -Wa,-gstabs,-adhlns=objdir/stm32f0_startup.lst -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C .././hal/stm32f0/stm32f0_startup.S -o objdir/stm32f0_startup.o

.

Linking: simpleserial-aes-CWNANO.elf

arm-none-eabi-gcc -mcpu=cortex-m0 -I. -DADD_JITTER -mthumb -mfloat-abi=soft -ffunction-sections -gdwarf-2 -DSS_VER=SS_VER_1_1 -DSTM32F030x6 -DSTM32F0 -DSTM32 -DDEBUG -DHAL_TYPE=HAL_stm32f0_nano -DPLATFORM=CWNANO -DTINYAES128C -DF_CPU=7372800UL -Os -funsigned-char -funsigned-bitfields -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=objdir/simpleserial-aes.o -I.././simpleserial/ -I.././hal -I.././hal/stm32f0 -I.././hal/stm32f0/CMSIS -I.././hal/stm32f0/CMSIS/core -I.././hal/stm32f0/CMSIS/device -I.././hal/stm32f0/Legacy -I.././crypto/ -I.././crypto/tiny-AES128-C -std=gnu99 -MMD -MP -MF .dep/simpleserial-aes-CWNANO.elf.d objdir/simpleserial-aes.o objdir/simpleserial.o objdir/stm32f0_hal_nano.o objdir/stm32f0_hal_lowlevel.o objdir/aes.o objdir/aes-independant.o objdir/stm32f0_startup.o --output simpleserial-aes-CWNANO.elf --specs=nano.specs --specs=nosys.specs -T .././hal/stm32f0_nano/LinkerScript.ld -Wl,--gc-sections -lm -mthumb -mcpu=cortex-m0  -Wl,-Map=simpleserial-aes-CWNANO.map,--cref   -lm  

.

Creating load file for Flash: simpleserial-aes-CWNANO.hex

arm-none-eabi-objcopy -O ihex -R .eeprom -R .fuse -R .lock -R .signature simpleserial-aes-CWNANO.elf simpleserial-aes-CWNANO.hex

.

Creating load file for EEPROM: simpleserial-aes-CWNANO.eep

arm-none-eabi-objcopy -j .eeprom --set-section-flags=.eeprom="alloc,load" \

	--change-section-lma .eeprom=0 --no-change-warnings -O ihex simpleserial-aes-CWNANO.elf simpleserial-aes-CWNANO.eep || exit 0

.

Creating Extended Listing: simpleserial-aes-CWNANO.lss

arm-none-eabi-objdump -h -S -z simpleserial-aes-CWNANO.elf > simpleserial-aes-CWNANO.lss

.

Creating Symbol Table: simpleserial-aes-CWNANO.sym

arm-none-eabi-nm -n simpleserial-aes-CWNANO.elf > simpleserial-aes-CWNANO.sym

Size after:
   text	   data	    bss	    dec	    hex	filename
   5044	    536	   1480	   7060	   1b94	simpleserial-aes-CWNANO.elf
+--------------------------------------------------------
+ Built for platform CWNANO STM32F030
+--------------------------------------------------------
simpleserial-aes.c: In function 'get_pt':
simpleserial-aes.c:42:3: warning: this 'for' clause does not guard... [-Wmisleading-indentation]
   for (volatile uint8_t k = 0; k < (*pt & 0x0F); k++);
   ^~~
simpleserial-aes.c:45:2: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'for'
  aes_indep_enc(pt); /* encrypting the data block */
  ^~~~~~~~~~~~~

Setup

Now let's go ahead. We'll have to program the file we built, so be sure to confirm we are using the right file!

In [3]:
%run "Helper_Scripts/Setup.ipynb"
In [4]:
import os, time

fw_path = '../../hardware/victims/firmware/simpleserial-aes/simpleserial-aes-{}.hex'.format(PLATFORM)

modtime = os.path.getmtime(fw_path)
print("File build time: {:s} (built {:.2f} mins ago)".format(str(time.ctime(modtime)), (time.time() - modtime)/60.0))
File build time: Fri Jun 14 15:06:25 2019 (built 0.02 mins ago)
In [5]:
cw.programTarget(scope, prog, fw_path)
Detected unknown STM32F ID: 0x445
Extended erase (0x44), this can take ten seconds or more
Attempting to programming 5579 bytes at 0x8000000
STM32F Programming flash...
STM32F Reading flash...
Verified flash OK, 5579 bytes

In addition, before we capture our traces, we'll need to create a ChipWhisperer project, since that's what Analyzer expects as input:

In [6]:
project = cw.createProject("projects/jupyter_test_jittertime.cwp", overwrite = True)

We can then create the object that will hold our traces:

In [7]:
tc = project.newSegment()

Capturing Traces

Below you can see the capture loop. The main body of the loop generates a new key and plaintext, loads them onto the target, arms the scope, starts the encryption, and finally records the new trace into our trace class. We'll also keep track of our keys manually for checking our answer later.

In [8]:
#Capture Traces
from tqdm import tnrange
import numpy as np
import time

ktp = cw.ktp.Basic(target=target)

keys = []
target.init()
for i in tnrange(num_traces, desc='Capturing traces'):
    # run aux stuff that should come before trace here

    key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
    keys.append(key)

    #target.reinit()

    target.setModeEncrypt()  # only does something for targets that support it
    target.loadEncryptionKey(key)
    target.loadInput(text)

    # run aux stuff that should run before the scope arms here

    scope.arm()

    # run aux stuff that should run after the scope arms here

    target.go()
    timeout = 50
    # wait for target to finish
    while target.isDone() is False and timeout:
        timeout -= 1
        time.sleep(0.01)

    try:
        ret = scope.capture()
        if ret:
            print('Timeout happened during acquisition')
    except IOError as e:
        print('IOError: %s' % str(e))

    # run aux stuff that should happen after trace here
    _ = target.readOutput()  # clears the response from the serial port
    #traces.append(scope.getLastTrace())
    tc.addTrace(scope.getLastTrace(), text, "", key)

Now that we have our traces, we need to tell the project that the traces are loaded and add them to the project's trace manager.

In [9]:
project.appendSegment(tc)

#Save project file
project.save()

We're now done with the ChipWhisperer hardware, so we should disconnect from the scope and target:

In [10]:
# cleanup the connection to the target and scope
scope.dis()
target.dis()

Analysis

To fix the jitter, we'll need to pass our traces through a preprocessing module. We can feed project.traceManager() right into attack.setTraceSource(), but we can also add preprocessing in between (more about this later). We'll also re-open the project. This is required if closeAll() has been called, since that call flushes the buffers.

In [11]:
#Force reload of project data (if you comment out 'closeAll()' this isn't needed)

#We also rebuild the project object in case you only want to run this half
import chipwhisperer as cw
project = cw.openProject("projects/jupyter_test_jittertime.cwp")

This time we're going to do a few things. First we will get the traces and plot a few of them as-is. You can change which traces are plotted by adjusting range(10); for example, range(1) plots only the first trace.

In [12]:
tm = project.traceManager()

from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.palettes import Dark2_5 as palette
import itertools  

output_notebook()
p = figure(sizing_mode='scale_width', plot_height=300)

# create a color iterator
colors = itertools.cycle(palette)  

x_range = range(0, tm.numPoints())
for i, color in zip(range(10), colors): #Adjust range(n) to plot certain traces
    p.line(x_range, tm.getTrace(i), color=color)
show(p)
Loading BokehJS ...

So how do we fix that? To begin with, plot only a single trace to make things clearer. You'll need to find a distinctive area of the trace, one whose shape does not appear anywhere else nearby. For example, see the following figure showing a single trace. In this example the location of A is unique, but B would have many matches within that same trace, even nearby:

Figure: Resync example trace
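One rough way to check whether a candidate area behaves like A or like B is to compare the window against shifted copies of itself within the same trace. The sketch below is only an illustration on synthetic data: nearest_false_match is a hypothetical helper (not a ChipWhisperer function), and the sine-plus-spike trace stands in for a real power trace.

```python
import numpy as np


def nearest_false_match(trace, window, max_shift):
    """Smallest sum of absolute differences (SAD) between the window and any
    shifted copy of itself within +/- max_shift points (shift 0 excluded).

    A value near zero means some other location looks almost identical,
    so the window is ambiguous (like region B in the figure).
    """
    start, end = window
    ref = trace[start:end]
    best = np.inf
    for shift in range(-max_shift, max_shift + 1):
        lo, hi = start + shift, end + shift
        if shift == 0 or lo < 0 or hi > len(trace):
            continue  # skip the trivial match and out-of-range windows
        best = min(best, float(np.sum(np.abs(trace[lo:hi] - ref))))
    return best


# Synthetic trace: a repetitive pattern with one distinctive spike added.
t = np.arange(2000)
trace = np.sin(2 * np.pi * t / 50.0)
trace[1200:1220] += 3.0

score_b = nearest_false_match(trace, (300, 400), 200)    # purely repetitive area
score_a = nearest_false_match(trace, (1150, 1270), 200)  # area containing the spike
print("repetitive window:", score_b)
print("distinctive window:", score_a)
```

The repetitive window finds a near-perfect match one period away, while the window containing the spike does not, which is the property you want from the resync window.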

We will specify two items:

  • A window with the "unique" area defined.
  • How far we will shift the window (+/- points) to search for the best match.
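To show what these two settings mean, here is a minimal NumPy sketch of a sum-of-absolute-differences (SAD) resync. It is an illustration under simplifying assumptions, not the ChipWhisperer ResyncSAD implementation: sad_resync is a hypothetical function, and np.roll wraps samples around the ends of the trace, which the library class may handle differently.

```python
import numpy as np


def sad_resync(trace, ref_trace, target_window, max_shift):
    """Align `trace` to `ref_trace` using the sum of absolute differences.

    target_window: (start, end) of the distinctive area in the reference.
    max_shift: how far (+/- points) to slide the window over `trace`.
    Returns the aligned trace and the shift that gave the best match.
    """
    start, end = target_window
    ref_segment = ref_trace[start:end]
    best_shift, best_sad = 0, np.inf
    for shift in range(-max_shift, max_shift + 1):
        lo, hi = start + shift, end + shift
        if lo < 0 or hi > len(trace):
            continue  # window would leave the valid data
        sad = np.sum(np.abs(trace[lo:hi] - ref_segment))
        if sad < best_sad:
            best_sad, best_shift = sad, shift
    # Move the matched area back onto the reference window.
    # np.roll wraps around; a real implementation might pad or truncate.
    return np.roll(trace, -best_shift), best_shift


# Synthetic check: delay a random "trace" by 7 points and recover the delay.
rng = np.random.default_rng(0)
ref = rng.normal(size=1000)
jittered = np.roll(ref, 7)
aligned, found_shift = sad_resync(jittered, ref, (300, 500), 20)
print("recovered shift:", found_shift)
```

If max_shift is smaller than the real jitter, the true alignment is never tested. If the window is not distinctive, a wrong shift can score just as well as the right one.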

You can use the following code to define the target_window and max_shift. Try a few values until you find something that works. Also try some deliberately poor values to see how the resync fails, and plot more traces to confirm that the match really works across the whole set.

In [13]:
resync_traces = cw.preprocessing.ResyncSAD(tm, connectTracePlot=False)
resync_traces.enabled = True
resync_traces.ref_trace = 0

if PLATFORM == "CWNANO":
    #Define a target window here. 500,900 for example is good based on above. But try some different values.
    resync_traces.target_window = (300, 700)

    # Define max_shift. Must not cause target_window to go outside of valid data. Try 16-600 range. Ideal value varies with how
    # much jitter is in original data. 
    resync_traces.max_shift = 300
elif PLATFORM == "CWLITEXMEGA" or PLATFORM == "CW303":
    #Define a target window here. 500,900 for example is good based on above. But try some different values.
    resync_traces.target_window = (1000, 1400)

    # Define max_shift. Must not cause target_window to go outside of valid data. Try 16-600 range. Ideal value varies with how
    # much jitter is in original data. 
    resync_traces.max_shift = 1000
else:
    #Define a target window here. 500,900 for example is good based on above. But try some different values.
    resync_traces.target_window = (700, 1500)

    # Define max_shift. Must not cause target_window to go outside of valid data. Try 16-600 range. Ideal value varies with how
    # much jitter is in original data. 
    resync_traces.max_shift = 700

# Reuses objects from previous cells (figure, colors, x_range), so run those cells first
output_notebook()
p = figure()

for i, color in zip(range(10), colors):
    p.line(x_range, resync_traces.getTrace(i), color=color)
show(p)

preprocessed_traces = resync_traces
Loading BokehJS ...

If this all works - let's just continue the attack! Do so as below:

In [14]:
leak_model = cw.AES128(cw.aes128leakage.SBox_output)
attack = cw.cpa(preprocessed_traces, leak_model)

And then actually run it:

In [15]:
cb = cw.getJupyterCallback(attack)
attack_results = attack.processTracesNoGUI(cb)
Finished traces 240 to 250
         0     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
PGE=     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
0       2B    7E    15    16    28    AE    D2    A6    AB    F7    15    88    09    CF    4F    3C
     0.702 0.755 0.731 0.790 0.749 0.715 0.727 0.746 0.808 0.807 0.790 0.705 0.716 0.753 0.732 0.758
1       19    62    40    CB    67    87    89    57    B6    52    50    94    0C    2E    C8    26
     0.318 0.312 0.294 0.325 0.314 0.323 0.334 0.298 0.346 0.317 0.312 0.312 0.326 0.309 0.303 0.299
2       5F    C9    A4    BE    30    73    58    F5    63    56    83    C0    16    F2    5B    2F
     0.315 0.310 0.282 0.309 0.291 0.294 0.302 0.295 0.306 0.309 0.291 0.307 0.321 0.301 0.302 0.296
3       3F    70    26    BF    A1    A3    D7    1C    E7    2B    C0    DA    D6    70    03    98
     0.298 0.301 0.279 0.307 0.287 0.282 0.295 0.289 0.295 0.307 0.285 0.290 0.312 0.293 0.302 0.295
4       4A    BA    7B    8E    73    6F    B5    47    FA    25    42    18    53    60    65    9C
     0.296 0.297 0.277 0.300 0.287 0.282 0.287 0.289 0.281 0.302 0.283 0.289 0.309 0.291 0.299 0.289

You should see the PGE reach 0 for each byte. If not, you might need to adjust the SAD resync settings. You may also need to capture more samples per trace. You may notice that the attack starts out working and then fails, because later traces become unsynchronized.
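For reference, the PGE (partial guessing entropy) of a key byte is the rank of the correct value in the list of guesses sorted by score, so 0 means the correct value is the top guess. The sketch below shows that definition with NumPy; partial_guessing_entropy is a hypothetical helper and the score arrays are made up for illustration (only the 0.702 for guess 2B is taken from the table above).

```python
import numpy as np


def partial_guessing_entropy(scores, correct_byte):
    """Rank of the correct key byte among all 256 guesses (0 = best guess).

    scores: array of 256 values, one per key guess, higher is better
    (for example the peak absolute correlation of each guess).
    """
    ranking = np.argsort(scores)[::-1]  # guesses ordered best to worst
    return int(np.where(ranking == correct_byte)[0][0])


# Illustrative scores: every guess near 0.30 except the correct one.
scores = np.full(256, 0.30)
scores[0x2B] = 0.702
pge_good = partial_guessing_entropy(scores, 0x2B)

# A failed attack: the correct byte is only the sixth-best guess.
bad_scores = np.arange(256) / 256.0
pge_bad = partial_guessing_entropy(bad_scores, 250)
print(pge_good, pge_bad)
```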

Plotting Correlation Output

In [16]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

attack_results = attack.getStatistics()
plot_data = cw.analyzerPlots(attack_results)
bnum = 0

ret = plot_data.outputVsTime(bnum)

output_notebook()
p = figure()
p.line(ret[0], ret[2], line_color='green')
p.line(ret[0], ret[3], line_color='green')

p.line(ret[0], ret[1], line_color='red')
show(p)
c:\users\user\appdata\local\programs\python\python37-32\lib\site-packages\numpy\core\fromnumeric.py:83: RuntimeWarning: invalid value encountered in reduce
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
Loading BokehJS ...

You should see a plot of correlation against time (in samples), drawn in red and green. In red is the correlation of the correct subkey for the first byte, while the rest are in green.

You should see two or three distinctive red spikes. The first is the spot where the sbox lookup for the subkey we guessed actually happens (the later ones are from later steps in the AES operation).
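The spikes come from computing, at every sample point, the correlation between a predicted leakage value and the measured power across all traces. The sketch below reproduces that idea on synthetic data. It is not the Analyzer code: correlation_vs_time is a hypothetical helper, the leakage values are random stand-ins for Hamming weights rather than real S-box outputs, and the leak position and noise level are arbitrary assumptions.

```python
import numpy as np


def correlation_vs_time(traces, leakage):
    """Pearson correlation between one leakage value per trace and
    every sample point. traces has shape (n_traces, n_points)."""
    t = traces - traces.mean(axis=0)
    l = leakage - leakage.mean()
    numerator = l @ t
    denominator = np.sqrt((l @ l) * np.sum(t * t, axis=0))
    return numerator / denominator


rng = np.random.default_rng(1)
n_traces, n_points, leak_point = 250, 100, 40

# Stand-in for Hamming weights (0..8) of an intermediate value.
leakage = rng.integers(0, 9, size=n_traces).astype(float)

# Noise everywhere, plus a data-dependent component at one sample point.
traces = rng.normal(size=(n_traces, n_points))
traces[:, leak_point] += 0.5 * leakage

corr = correlation_vs_time(traces, leakage)
peak = int(np.argmax(np.abs(corr)))
print("largest correlation at sample", peak)
```

If the traces were misaligned, the data-dependent component would be spread over several sample points and the peak would shrink, which is why the resync step matters for this attack.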

What about the rest of the bytes in the key? We can get and plot that easily as well:

In [17]:
rets = []
for i in range(0, 16):
    rets.append(plot_data.outputVsTime(i))

p = figure()
for ret in rets:
    p.line(ret[0], ret[2], line_color='green')
    p.line(ret[0], ret[3], line_color='green')
    
for ret in rets:
    p.line(ret[0], ret[1], line_color='red')

show(p)

Conclusion

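Awesome! You should now have completed a resynchronization of power traces. This is a very useful tool, and since a preprocessing module is just a class that sits between the trace source and the attack, you can see how writing a simple class of your own could extend this work.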

Tests

In [18]:
key = project.traceManager().getKnownKey(0)
recv_key = [kguess[0][0] for kguess in attack_results.findMaximums()]
assert (key == recv_key).all(), "Failed to recover encryption key\nGot: {}\nExpected: {}".format(recv_key, key)
In [19]:
assert (attack_results.pge == [0]*16), "PGE for some bytes not zero: {}".format(attack_results.pge)
In [20]:
if CHECK_CORR:
    max_corrs = [kguess[0][2] for kguess in attack_results.findMaximums()]
    assert (np.all([corr > 0.75 for corr in max_corrs])), "Low correlation in attack (corr <= 0.75): {}".format(max_corrs)